Abstract


The recent proliferation of high performance workstations and the increased
reliability of parallel systems have illustrated the need for robust job management
systems to support parallel applications. To address this issue, NAS compiled a
requirements checklist for job queuing/scheduling software [Jon96]. Next, NAS
began an evaluation of the leading job management system (JMS) software
packages against the checklist. This report describes the three-phase evaluation
process, and presents the results of Phase 1: Capabilities versus Requirements. We
show that JMS support for running parallel applications on clusters of workstations
and parallel systems is still insufficient, even in the leading JMSs. However, by
ranking each JMS evaluated against the requirements, we provide data that will be
useful to other sites in selecting a JMS.


1. MRJ, Inc., NASA Contract NAS 2-14303, Moffett Field, CA 94035-1000 





1.0 Introduction 


The Numerical Aerodynamic Simulation (NAS) supercomputer facility, located 
at NASA Ames Research Center, has been working for the last few years to 
bring parallel systems and clusters of workstations into a true production 
environment. One of the primary difficulties has been identifying a robust job 
management system (JMS) capable of completely supporting parallel jobs. For 
a complete discussion of the role of and need for a JMS, see [Sap95].

Many JMS software packages exist that cover a wide range of needs, from
traditional queuing/batch systems to “load-balancing” and “cycle-stealing”
software for workstations, but few attempt to support parallel jobs. To address
this deficiency, NAS produced the NAS Requirements Checklist for Job
Queuing/Scheduling Software [Jon96], with input from the NASA Cooperative
Agreement (CAN) NCC3-413 project members (NAS, NASA Ames, NASA
Langley, NASA Lewis, Pratt & Whitney, Platform Computing, and the PBS
group) as well as from Cray Research, Inc. (CRI) and IBM. (For a complete
description of the cooperative agreement see [CAN95].) This list of
requirements focuses on the needs of a site which runs
parallel applications (e.g. message-passing codes) across clusters of 
workstations and parallel systems. However, the requirements attempt to cover 
the gamut from clusters of PCs to MPPs and clusters of Crays. The intent was 
twofold: to provide a baseline set of requirements against which to measure and 
track various JMSs over time; and to provide direction to JMS vendors as they 
plan product improvements. Therefore, the requirements list was published 
separately from this evaluation paper in order to allow vendors the maximum 
amount of time to address the requirements. A condensed summary of the 
requirements is reproduced herein; refer to the original document for a 
complete description of each requirement. 

Recently, there have been several excellent comparisons of job queueing/batch
software systems, e.g. [Bak95, Kap94]. The two comparisons cited cover
most of the vast array of available JMS products. The NAS evaluation differs
from these in two primary ways. First, NAS chose to evaluate only the four
leading JMSs. Second, NAS chose to perform a more in-depth comparison,
with more than twice the number of criteria used in the cited evaluations.

2.0 Evaluation Description 

This paper discusses an evaluation of the leading job management systems in 
order to identify the one(s) that best meet(s) the needs and requirements of NAS. 
The evaluation will proceed in three phases, as shown in Tables 1 and 2. 

After the evaluation plan was written, we identified which JMS software
packages to evaluate. Table 3 lists the four packages identified, and the versions
selected for evaluation.
TABLE 1. Phases of Comparison

| Phase | Description |
|---|---|
| Phase 1 | Capabilities versus requirements |
| Phase 2 | Staff and selected user testing |
| Phase 3 | Full deployment, production use |


TABLE 2. Steps in Evaluation 

Phase 1: 

1. Obtain most recent production release (non-beta) of JMS from each
vendor (see Table 3 below).

2. Review vendor-supplied documentation for JMS system. 

3. Perform pencil-paper comparison of JMS requirements against stated 
capabilities, assigning “points” according to SCALE (see below). 

4. Provide each vendor an opportunity to review and correct any technical 
errors in the evaluation of their product. 

5. Rank all JMS systems against METRIC (see below) of capabilities against 
requirements. 

6. Any JMS falling below MINIMUM THRESHOLD (see below) will be
eliminated from comparison; all remaining will continue to Phase 2.

7. Summarize and publish results. 

Phase 2: (for each JMS meeting minimum requirements) 

A. For each test platform (see Table 5 below)

1. Install software in test configuration. 

2. Configure and/or write basic job scheduler. 

3. Verify capabilities claimed in vendor-supplied documentation. 

4. Re-score as necessary. 

5. Configure and/or write complex job scheduler. 

6. Run simulated TEST SUITE (see Section 4 below) against JMS. 

7. Open system for staff testing. 

8. Open system for selected user testing. 

9. Solicit feedback from testing. 

B. Test inter-platform JMS capabilities. 

C. Summarize and publish results. 

D. Optionally perform Phase 3 evaluation at this time. 

E. Archive JMS configuration. 

F. Deinstall JMS. 

Phase 3: (Optional) 

1. Install software in production configuration. 

2. Configure and/or write complete job scheduler with all NAS policies. 

3. Produce all necessary documentation and guides to educate users on 
JMS. 

4. Evaluate under normal user workload for several months. 

Conclusion: 

1. Produce summary report of findings.


TABLE 3. JMS Software Selected for Evaluation

| JMS | Version | Vendor | Released |
|---|---|---|---|
| LoadLeveler (LL) | v.1.2.1 | IBM | Aug 95 |
| Load Sharing Facility (LSF) | v.2.2 | Platform | 28 Feb 96 |
| Network Queueing Env (NQE) | v.2.0 | CRI | 31 Mar 95 |
| Portable Batch System (PBS) | v.1.1.5 | NASA | 18 Jan 96 |


A general description of each of these products is given in the Phase 1 Results 
section below. 

Next, we generated a rough timeline for the evaluation. Table 4 shows the portion
of the timeline covered by this paper. (Table 11 in Section 5 below gives the
revised timeline for the conclusion of the project.)


TABLE 4. Timeline of JMS Evaluation, Phase 1

| Time Period | Activity |
|---|---|
| 1 March 1996 | Cut-off date for vendor release of production software* |
| 1 March - 15 April | Phase 1 comparison |
| 15 April - 15 May | Summarize and publish Phase 1 results |


Choosing a cut-off date was necessary to set a fixed window of time for the
evaluation. The original proposed date was revised to March 1st in order to include
the latest versions of LSF and NQE, both of which were scheduled for a major
release at the end of February 1996. Unfortunately, the NQE release 3.0 slipped
three months, so the current version 2.0 was evaluated. The next release of
LoadLeveler is scheduled for Fall 1996.
We then determined which computer systems would be used for the second
phase of the evaluation. The three testbed systems at NAS, listed in Table 5, were
selected for the diversity and flexibility they provide. Because they are not true
production systems, we have more latitude with regard to software changes and
providing staff with dedicated-system time. The three systems differ in their
workload and job mix, but all three give priority to supporting parallel and
message-passing applications.


TABLE 5. Phase 2 Comparison Platforms

| Architecture | NAS Hostname | Configuration |
|---|---|---|
| SGI PowerChallenge | davinci | 8-node (40 CPU) workstation cluster, 1 front end |
| CRI J90 | newton | 4-node (20 CPU) cluster, 1 front end |
| IBM SP2 | babbage | 160-node (160 CPU) SP2, 2 front ends |


In addition, we determined that the test suite to be used in Phase 2 for evaluating 
each JMS will consist of a combination of the following: 

• A suite of applications including the NAS Parallel Benchmarks (NPBs) 

• Jobs or scripts testing particular features of the JMS 

• Simulated job stream (based on past job accounting data from the SP2) 

The details of the test suite will be determined prior to beginning Phase 2. 
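
As an illustration of the third item, a simulated job stream can be drawn by resampling past accounting records. The report leaves the details of the test suite open, so the following Python sketch rests entirely on assumptions: the record fields (`nodes`, `runtime_s`), the exponentially distributed inter-arrival times, and all names are hypothetical and are not taken from the NAS accounting data.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One past job from accounting data (hypothetical fields)."""
    nodes: int
    runtime_s: int


@dataclass(frozen=True)
class SimulatedJob:
    """One job in the simulated stream."""
    submit_s: float
    nodes: int
    runtime_s: int


def simulate_job_stream(history, count, mean_interarrival_s, seed=0):
    """Resample past jobs into a stream with random submit times.

    Inter-arrival gaps are exponentially distributed (an assumption);
    job sizes and run times are drawn with replacement from history.
    """
    if not history:
        raise ValueError("history must contain at least one job record")
    rng = random.Random(seed)  # seeded so that a test run is repeatable
    clock = 0.0
    stream = []
    for _ in range(count):
        clock += rng.expovariate(1.0 / mean_interarrival_s)
        past = rng.choice(history)
        stream.append(SimulatedJob(clock, past.nodes, past.runtime_s))
    return stream
```

Seeding the generator means the same stream can be replayed against each JMS, which keeps the comparison between products fair.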

While the main focus of Phase 1 was to compare capabilities of the selected
products, we also wanted a way to eliminate from Phase 2 any JMS that did not
meet a minimum number of our requirements; it would not be worthwhile to
perform the level of evaluation required in Phase 2 on products that did not meet
enough of our needs.

Since the list of requirements was divided into three main categories (absolute
requirements, recommended capabilities, and future requirements), we decided to
use the absolute requirements (those listed in the requirements checklist in
section 3 below) for the elimination metric. Each of those requirements was further
ranked as high or medium priority. From this we generated the following simple
metric, a percentage index for the number of section 3 criteria met, taking the
priority into consideration:

[ sum ( “score” * “priority”) ] / max possible * 100 

We next determined what the “minimum threshold” would be: any JMS ranking
below 90 percent on the above metric will be eliminated from the Phase 2
comparison as not meeting enough of the base requirements. With these details
decided, we proceeded with the Phase 1 evaluation.
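
The metric and threshold can be sketched in a few lines of Python. The report does not state the numeric value of each score or priority, so the values below (4 down to 0 for the five score levels, 2 for high and 1 for medium priority) are assumptions chosen only for illustration, as are all of the names; only the 90 percent threshold comes from the text.

```python
from dataclasses import dataclass

# Assumed numeric values; the report does not publish them.
SCORE_VALUE = {"full": 4, "most": 3, "half": 2, "little": 1, "none": 0}
PRIORITY_WEIGHT = {"high": 2, "medium": 1}
MINIMUM_THRESHOLD = 90.0  # percent, as stated in the text


@dataclass(frozen=True)
class Evaluation:
    """Score assigned to one section 3 requirement (hypothetical type)."""
    requirement: str
    priority: str  # "high" or "medium"
    score: str     # key into SCORE_VALUE


def percentage_index(evaluations):
    """Return [sum(score * priority)] / max possible * 100."""
    if not evaluations:
        raise ValueError("at least one evaluation is required")
    earned = sum(SCORE_VALUE[e.score] * PRIORITY_WEIGHT[e.priority]
                 for e in evaluations)
    max_possible = sum(SCORE_VALUE["full"] * PRIORITY_WEIGHT[e.priority]
                       for e in evaluations)
    return earned / max_possible * 100


def continues_to_phase_2(evaluations):
    """A JMS ranking below the threshold is eliminated."""
    return percentage_index(evaluations) >= MINIMUM_THRESHOLD
```

Under these assumed weights, a high-priority requirement counts twice as much as a medium-priority one, so a shortfall on a high-priority item lowers the index more than the same shortfall on a medium-priority item.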
The following section gives an abbreviated list of the requirements used in the
evaluation. Again, we suggest reviewing the evaluation data alongside a copy of
the complete requirements.


3.0 Condensed Requirements List 

Job Management System 
High Priority 

3.1.1 Must operate in a heterogeneous multi-computer environment... 

3.1.2 Must integrate with frequently used distributed file systems... 

3.1.3 Must possess a command line interface to all modules of the JMS... 

3.1.4 Must include a published application programming interface (API) to 
every component of the JMS... 

3.1.5 Must be able to enforce resource allocations and limits... 

3.1.6 Software must permit multiple versions on same system... 

3.1.7 Source code must be available for complete JMS... 

3.1.8 Must be able to define more than one user id as JMS administrator...

Medium Priority 

3.1.9 Must provide a means of user identification outside the password file... 

3.1.10 Must be scalable... 

3.1.11 Must meet all requirements of appropriate standards...

Resource Manager Requirements 
High Priority 

3.2.1 Must be “parallel aware,” i.e. understand the concept of a parallel job 
and maintain complete control over that job... 

3.2.2 Must be able to support and interact with MPI, PVM, HPF... 

3.2.3 Must provide file “stage-in” and “stage-out” capabilities... 

3.2.4 Must provide user-level checkpointing/restart... 

Medium Priority 

3.2.5 Must provide a history log of all jobs... 

3.2.6 Must provide asynchronous communication between application and 
Job Manager via a published API... 

3.2.7 Must be integrated with authentication/security system... 

3.2.8 Interactive-batch jobs must run with standard input, output, and error 
file streams connected to a terminal... 
Scheduler Requirements
High Priority 

3.3.1 Must be highly configurable... 

3.3.2 Must provide simple, out-of-the-box scheduling policies... 

3.3.3 Must schedule multiple resources simultaneously... 

3.3.4 Must be able to change the priority, privileges, run order, and resource 
limits of all jobs, regardless of the job state... 

3.3.5 Must provide coordinated scheduling... 

Medium Priority 


3.3.6 Must provide mechanism to implement any arbitrary policy... 

3.3.7 Must support unsynchronized timesharing of jobs... 

3.3.8 Sites need to be able to define specifics on time-sharing... 

Queuing System Requirements 

High Priority 

3.4.1 Must support both interactive and batch jobs with a common set of 
commands... 

3.4.2 User Interface must provide specific information... 

3.4.3 Must provide for restricting access to the batch system using a variety 
of site-configurable methods... 

3.4.4 Must be able to sustain hardware or system failure... 

3.4.5 Must be able to configure and manage one or more queues... 

3.4.6 Administrator must be able to create, delete, and modify resources 
and resource types... 

3.4.7 Administrator must be able to change a job’s state... 

3.4.8 Must allow dynamic system reconfiguration by administrator with 
minimal impact on running jobs... 

3.4.9 Must provide centralized administration... 

3.4.10 Users must be able to reliably kill their own job... See 3.2.1 above. 

Medium Priority 

3.4.11 Must provide administrator-configurable programs to be run by JMS 
before and after a job... 

3.4.12 Must include user specifiable job interdependency... 

3.4.13 Must allow jobs to be submitted from one cluster and run on another...

3.4.14 Must provide a site-configurable mechanism... to permit users to have
access to information about jobs from other submitters...
Requested Capabilities
High Priority 

4.1.1 Job scheduler should support dynamic policy changes... 

4.1.2 Possess a Graphical User Interface (GUI) to JMS...

4.1.3 Provide a graphical representation of the configuration and usage of 
the resources under the JMS... 

Medium Priority 

4.1.4 The time-sharing configuration information should be available to the
job scheduler for optimizing job scheduling... 

4.1.5 Provide a graphical monitoring tool with the specified capabilities... 

4.1.6 Support both hard and soft limits when appropriate...

4.1.7 Should be readily available with full, complete support... 

4.1.8 Should supply some kind of a proxy account optional setup...

4.1.9 Should provide specified accounting capabilities... 

Low Priority 

4.1.10 Should allow a site to choose to run separate resource managers for 
each system (or cluster), as well as a single resource manager for all 
systems... 

4.1.11 Should allow owner of interactive jobs to “detach” from the job...

4.1.12 Should provide a mechanism to allow reservations of any resource... 

4.1.13 Should provide specific attributes for jobs... 

4.1.14 Should be able to define and modify a separate access control list for 
each supported resource.... 

4.1.15 Should provide wide area network support... 

4.1.16 Should allow an interactive user on a workstation console to instruct 
the JMS to suspend or migrate a job to a different workstation... 

4.1.17 Should provide both client and server capabilities for Windows NT...

Future Requirements 
High Priority 

5.1.1 Should provide gang-scheduling... 

5.1.2 Should provide dynamic load balancing... 

5.1.3 Should provide job migration... 

Medium Priority 

5.1.4 Should inter-operate with OS level checkpointing, providing the 
ability for the JMS to restart a job from where it left off and not 
simply from the beginning.... 
4.0 Phase 1 Results


The results of Phase 1: Capabilities versus Requirements for the products
evaluated are provided below. A description of each product is provided followed by
its evaluation. As indicated in Table 2 above, each vendor was given the
opportunity to review and correct any technical inaccuracies in the evaluation of
their product. It should be noted that CRI did not accept this opportunity.

Table 6 lists the definitions of “scores” for each requirement. Note that instead of
performing a “yes/no” or “has/has not” comparison, we attempt to determine
how much of each requirement the JMS meets. The result for each requirement
is presented in a single “score” accompanied by a short explanatory note. The
notes are not intended to replace the description of the requirements. A copy of
NAS Requirements Checklist for Job Queuing/Scheduling Software [Jon96] is
required to interpret the evaluation data.


Table 6: Score Definitions

| Score | Explanation |
|---|---|
| • | Meets requirement |
| 3 | Meets most of requirement |
| 3 | Meets roughly half of requirement |
| 0 | Meets little of requirement |
| O | Does not meet any of requirement |
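
The five-level scale can be treated as an ordered category rather than a yes/no flag. The Python sketch below is only an illustration of that idea: the `Score` type, the `tally` helper, and the (requirement, score, note) tuple layout are hypothetical and do not come from the evaluation itself.

```python
from collections import Counter
from enum import Enum


class Score(Enum):
    """The five score levels defined in Table 6, best to worst."""
    MEETS = "meets requirement"
    MOST = "meets most of requirement"
    HALF = "meets roughly half of requirement"
    LITTLE = "meets little of requirement"
    NONE = "does not meet any of requirement"


def tally(results):
    """Count how many requirements fall at each score level.

    results is an iterable of (requirement, Score, note) tuples.
    Levels with no entries are reported as zero.
    """
    counts = Counter(score for _, score, _ in results)
    return {level: counts.get(level, 0) for level in Score}
```

A per-level tally of this kind gives a quick profile of a product before the notes for individual requirements are read.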


4.1 LoadLeveler (LL) 

LoadLeveler, from IBM, is a commercially available, general-purpose JMS
software package. Emphasis is currently on clusters of workstations running single
serial jobs. Some support for parallel jobs is provided, but is limited to SP
systems where the Parallel Operating Environment (POE) is available. Extensive
support for parallel jobs (including non-SPs) is scheduled for the Fall 1996 release.
Information for this evaluation is based on [IBM95a, IBM95b]. Additional
information is online: (http://spud-web.tc.cornell.edu/hn/frame/LL.html).


Table 7: LoadLeveler 1.2.1

| Requirement | Score | Notes |
|---|---|---|
| 3.1.1 | 3 | SP2, RS/6000, SUN, SGI, HP; no support for CRI UNICOS (one of the evaluation platforms) |
| 3.1.2 |  | NFS and AFS only; DFS/DCE due 1Q97 |
| 3.1.3 |  | has command line interface |
| 3.1.4 |  | API for accounting, prologue, epilogue, checkpoint (serial), submit, monitor; scheduler API due 3Q96 |
| 3.1.5 |  | not provided: wall-clock time (due 3Q96); provides per-process, not per-job: memory utilization; swap, dedicated/shared access |
| 3.1.6 |  | via different port numbers and file tree |
| 3.1.7 |  | source-code available for a price |
| 3.1.8 |  | multiple managers, no operators |
| 3.1.9 |  | insufficient user identification mechanisms |
| 3.1.10 |  | in use at Cornell: 512 nodes; another site: 800+ nodes |
| 3.1.11 |  | does not meet POSIX 1003.2d, “Batch Queueing Extensions” standard |
| 3.2.1 |  | does not track all subprocesses, forward signals, or provide job-JMS communication for job-start; accounting is questionable; tracks parent-wait3-child processes only |
| 3.2.2 |  | “supports” but does not interact with MPI, PVM, HPF |
| 3.2.3 |  | suggests use of prologue/epilogue to copy files, but no automatic file staging as required |
| 3.2.4 |  | system-level checkpoint/restart where supported by OS; JMS-assisted user-level checkpointing for serial jobs only |
| 3.2.5 |  | combination of UNIX accounting data and LL generated data (no suspended execution data) |
| 3.2.6 |  | application-JMS communication not available |
| 3.2.7 |  | UNIX-level security only; DCE support in 1Q97 |
| 3.2.8 |  | does not support batch-scheduled interactive jobs |
| 3.3.1 |  | does not support dynamic & pre-emptive resource allocation; only distinguishes batch and interactive jobs |
| 3.3.2 |  | capable of all except “fair-share”; need to be configured before use |
| 3.3.3 |  | scheduler supports all listed, except supports only one file-system (execution directory) |
| 3.3.4 |  | cannot change running jobs |
| 3.3.5 |  | supports space-sharing |
| 3.3.6 |  | scheduler not separable from JMS; no API for scheduler (due 3Q96) |
| 3.3.7 |  | supports unsynchronized timesharing |
| 3.3.8 |  | via local configuration in MACHINE stanza |
| 3.4.1 |  | handles both interactive and batch |
| 3.4.2 |  | does not provide resources consumed for running jobs or for subprocesses of parallel jobs; no status of system resources |
| 3.4.3 |  | provided |
| 3.4.4 |  | jobs (except interactive) are automatically requeued/resumed/rerun in event of system failure |
| 3.4.5 |  | provided |
| 3.4.6 |  | provided |
| 3.4.7 |  | provided |
| 3.4.8 |  | can add/delete nodes; can request each daemon re-read its configuration files |
| 3.4.9 |  | commands are centralized, log and accounting files are distributed, but tools are provided to combine remote logs into single log |
| 3.4.10 |  | if subprocesses of parallel jobs are not controlled, then JMS cannot guarantee to kill processes |
| 3.4.11 |  | provided |
| 3.4.12 |  | job dependencies limited to “job-steps” (steps/statements within a job) rather than “jobs” |
| 3.4.13 |  | provided |
| 3.4.14 |  | provided |
| 4.1.1 |  | allows reconfiguration of JMS scheduler without affecting rest of JMS |
| 4.1.2 |  | has GUI “to all functions” (LL. Summary p.4) |
| 4.1.3 |  | no graphical system configuration tool |
| 4.1.4 |  | no MACHINE stanza for this (due '97) |
| 4.1.5 |  | no graphical monitoring tool (suggests using separate product, “Performance Toolbox/6000”) |
| 4.1.6 |  | supports hard limits (wall-clock); allows user-specified simple soft limit; limits do not take into consideration multi-node parallel jobs; focused on “job steps” |
| 4.1.7 |  | supported by large software company |
| 4.1.8 |  | via USERS stanza |
| 4.1.9 |  | JMS accounting provides some of the data and some tools to process it |
| 4.1.10 |  | provided |
| 4.1.11 |  | cannot detach/reattach; plus no concept of “interactive-batch” |
| 4.1.12 |  | no resource reservations |
| 4.1.13 |  | doesn’t accurately track all parallel job resource consumption or limits |
| 4.1.14 |  | ACL only for selected resources (e.g. hosts) |
| 4.1.15 |  | distance not an issue as long as network is stable and reliable |
| 4.1.16 |  | no workstation owner-JMS interaction |
| 4.1.17 | o | no Windows NT support |
| 5.1.1 | o | no gang-scheduling |
| 5.1.2 | o | no dynamic load-balancing |
| 5.1.3 | 0 | only for serial jobs |
| 5.1.4 | 0 | only for serial jobs |


4.2 Load Sharing Facility (LSF) 

LSF, the Load Sharing Facility, from Platform Computing Corporation, is a
commercially available, general-purpose JMS software package. Emphasis is on
providing a single package for all needs, but focuses on load balancing and
“cycle-stealing”. Only limited parallel job support is provided. Extensive support
for parallel jobs is due in a late 1996 release. Information for this evaluation is
based on [Pla96a, Pla96b, Pla96c]. Additional information is available online:
(http://www.platform.com).


Table 8: LSF 2.2

| Requirement | Score | Notes |
|---|---|---|
| 3.1.1 | • | Currently: ConvexOS, UNICOS, OSF/1, HP-UX, AIX, Linux, NEC EWS OS, Solaris, SunOS, Sony NEWS |
| 3.1.2 | • | provided |
| 3.1.3 | • | commands well documented |
| 3.1.4 | D | general API provided (not for scheduler) |
| 3.1.5 | 0 | no support for disk usage, swap, network |
| 3.1.6 | • | via different port numbers |
| 3.1.7 | • | available on specific-case basis |
| 3.1.8 | • | provides primary administration, and queue-level administration |
| 3.1.9 | • | provides site-configurable authentication on per-queue level |
| 3.1.10 | D | claims scalability to above 200 hosts |
| 3.1.11 | o | does not meet POSIX 1003.2d, “Batch Queueing Extensions” standard |
| 3.2.1 | 0 | aware of needs, but all tools directed at sequential, serial jobs |
| 3.2.2 | 0 | supports, but does not interact |
| 3.2.3 | D | users can do file-staging via user-level pre-execution capability; includes tests for check/requeue |
| 3.2.4 | 3 | system-level checkpoint/restart where supported by OS; JMS-assisted, user-level checkpointing for serial jobs only |
| 3.2.5 | a | meets all except those listed in 3.1.5 above |
| 3.2.6 | o | no published job-JMS API |
| 3.2.7 | 3 | has some DCE support; site configurable |
| 3.2.8 | O | no support for batch-scheduled interactive sessions |
| 3.3.1 | 3 | not highly configurable (must use provided scheduling algorithms); no concept of interactive-batch |
| 3.3.2 | 3 | has many of those listed |
| 3.3.3 | • | can configure via HOST stanza |
| 3.3.4 | a | once running, observable resources only; other job states: yes |
| 3.3.5 | a | supports space-sharing |
| 3.3.6 | o | scheduler not separable; no scheduler API |
| 3.3.7 |  | provided |
| 3.3.8 |  | via job limits per host |
| 3.4.1 | 0 | handles both, but does not provide common command set |
| 3.4.2 |  | no remaining resource tracking |
| 3.4.3 | • | provided |
| 3.4.4 | B | jobs (except interactive jobs) are automatically requeued/resumed/rerun in event of system failure |
| 3.4.5 |  | provided |
| 3.4.6 | • | provided |
| 3.4.7 | • | provided |
| 3.4.8 | • | provided |
| 3.4.9 | • | administration and logs can be centralized (via shared filesystem) |
| 3.4.10 | O | does not have full parallel awareness, therefore cannot “reliably kill” job subprocesses |
| 3.4.11 | • | provided |
| 3.4.12 |  | meets all “status of other computer system” |
| 3.4.13 | • | provided |
| 3.4.14 | o | not configurable; default is “all users can see all other users jobs” |
| 4.1.1 | • | allows reconfiguration of JMS scheduler without affecting rest of JMS |
| 4.1.2 | • | GUI for all modules |
| 4.1.3 |  | one window per cluster |
| 4.1.4 | • | via HOSTS stanza |
| 4.1.5 | ? | captures snapshot via external program such as xv |
| 4.1.6 | 3 | supports hard limits only |
| 4.1.7 | 3 | very popular package for cycle stealing and load balancing |
| 4.1.8 | • | Create shared account(s) for LSF jobs to run under, restrict access via configuration file |
| 4.1.9 | 3 | JMS provides some requested data in ASCII format, and simple tool to process records |
| 4.1.10 | 0 | cannot schedule multiple “clusters” with single server; vendor suggests putting all machines to be scheduled into single “cluster” |
| 4.1.11 | O | cannot detach/reattach; plus no concept of “interactive-batch” |
| 4.1.12 | o | no resource reservation |
| 4.1.13 | 3 | no resource consumption counters |
| 4.1.14 | 0 | controls access to JMS, specific hosts, classes of hosts, and queues only |
| 4.1.15 | • | distance not an issue as long as network is stable and reliable |
| 4.1.16 | 0 | only indirectly; if load on system goes up, JMS may reallocate resources |
| 4.1.17 | O | no Windows NT support |
| 5.1.1 | o | no gang-scheduling |
| 5.1.2 | o | no dynamic load-balancing |
| 5.1.3 | 3 | provides only for serial jobs where supported by OS |
| 5.1.4 | • | provided |


4.3 Network Queueing Environment (NQE) 

NQE, the Network Queueing Environment, from the CraySoft division of Cray
Research Inc., is a commercially available, general-purpose JMS software
package. Emphasis is currently on JMS support of large CRI machines, but it also
provides batch queueing for clusters of workstations running single serial jobs.
Initial support for parallel jobs arrived with the July 1996 release, too late to be
included in this evaluation. Information for this evaluation is based on [Cra95a,
Cra95b, Cra95c]. Additional information on the latest release is available online:
(http://www.cray.com/PUBLIC/product-info/sw/nqe/nqe30.html).
Table 9: NQE 2.0

| Requirement | Score | Notes |
|---|---|---|
| 3.1.1 | • | Solaris, SunOS, IRIX, AIX, HP-UX, DEC OSF/1, UNICOS |
| 3.1.2 | (3 | NFS support only |
| 3.1.3 | • | has command-line interface |
| 3.1.4 | D | API to “all” components |
| 3.1.5 | 3 | supports: number CPUs, CPU time, memory, disk |
| 3.1.6 |  | via different port numbers |
| 3.1.7 |  | source code available for a negotiable price |
| 3.1.8 |  | provided |
| 3.1.9 |  | provided |
| 3.1.10 | 0 | no explanation of extent of scalability |
| 3.1.11 | O | does not meet POSIX 1003.2d, “Batch Queueing Extensions” standard |
| 3.2.1 | o | due in v.3.0 (July 96) |
| 3.2.2 | 0 | supports PVM |
| 3.2.3 |  | provides a “file-transfer agent” to move data from system to system, with fault tolerance |
| 3.2.4 | 0 | system-level checkpoint/restart where supported by OS; no JMS-assisted user-level checkpointing |
| 3.2.5 | 0 | very limited accounting logs, appears to rely on UNIX accounting for most data |
| 3.2.6 | O | no application-JMS communication available |
| 3.2.7 | 0 | no indication of AFS/DFS/DCE support |
| 3.2.8 | O | no concept of “interactive-batch” |
| 3.3.1 | 0 | doesn’t support dynamic & preemptive resource allocation; only distinguishes batch and interactive jobs |
|  |  | scheduler (and underlying NQS) can support some listed |
| 3.3.4 |  | once running, observable resources only; other job states: yes |
| 3.3.5 |  | supports space-sharing |
| 3.3.6 | O | scheduler not separable from JMS; no API for scheduler (due 3Q96) |
| 3.3.7 |  | supports unsynchronized time-sharing |
| 3.4.1 |  | handles both interactive and batch jobs |
| 3.4.2 |  | does not provide the following: why not running, consumed/remaining resources, allocated/requested resources, state of all |
| 3.4.3 |  | not all restrictions |
|  |  | provided |
|  |  | provided |
|  |  | only before job is started |
| 3.4.10 |  | no parallel awareness |
| 3.4.11 |  | no prologue/epilogue support |
| 3.4.12 |  | no status of other computer systems |
| 3.4.13 |  | access restrictions apply |
| 3.4.14 | 3 | all or nothing configurable |
| 4.1.1 | 3 | limited |
| 4.1.2 |  | motif/X and WWW |
| 4.1.3 | o | no graphical system configuration tool |
| 4.1.4 | o | none |
| 4.1.5 | o | no graphical monitoring tool |
| 4.1.6 | 3 | hard limit: yes; soft limit: no |
| 4.1.7 |  | based on NQS (old de facto standard) |
| 4.1.8 |  | via shared account and ACLs |
| 4.1.9 |  | much of necessary data provided, no tools to process data however |
| 4.1.10 |  | limited |
| 4.1.11 | o | cannot detach/reattach; plus no concept of “interactive-batch” |
| 4.1.12 | 3 | has SRFS support, but no other |
| 4.1.13 | 3 | no computation counters |
| 4.1.14 | O | no ACLs |
| 4.1.15 | • | distance not an issue as long as network is stable and reliable |
| 4.1.16 | o | no workstation owner-JMS interaction |
| 4.1.17 | o | no Windows NT support |
| 5.1.1 | o | no gang-scheduling |
| 5.1.2 | o | no dynamic load-balancing |
| 5.1.3 | o | no job migration support |
| 5.1.4 | • | where supported by OS |


4.4 Portable Batch System (PBS) 

PBS, the Portable Batch System, developed and maintained by the NAS Facility
at NASA Ames Research Center, is a freely available, general-purpose JMS
software package. Emphasis is on providing a single package for all needs, but
focuses on support for high-performance computing (e.g. supercomputers and
clusters of workstations). Extensive support for parallel jobs is due in a
September 1996 release, with support for dynamic resource management due in a
January 1997 release. Information for this evaluation is based on [Hen96a, Hen96b].
Additional information is available online: (http://www.nas.nasa.gov/NAS/PBS).


Table 10: PBS 1.1.5 


Requirement 

Score 

Notes 

3.1.1 

• 

Currently: IRIX, AIX, UNICOS, SunOS, Solaris, 
CM5, SP2, CRAY C90, J90 

3.1.2 

0 

NFS support only; DCE/DFS support (due 4Q96) 

3.1.3 

• 

commands well documented and explained 

3.1.4 

EM 

API well -documented and explained 

3.1.5 


network adapter access enforcement only if OS 
makes it observable 

3.1.6 

• 

implemented via different port numbers and 
directories 

3.1.7 

• 

source freely available 

3.1.8 

• 

provides both manager and operator IDs 

3.1.9 

• 

provides ACL in addition to /etc/passwd; could use a 
single generic account and control all user access via 
ACLs 

3.1.10 

o 

in production use on a 1 60-node SP2 at NAS 

3.1.11 

EM 

provided 
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Table 10: PBS 1.1.5 


3.2.1        o      capability will be included in “full parallel
                    awareness” (due 4Q96)
3.2.2        3      “supports” but does not “interact”; capability will be
                    included in “dynamic parallel awareness” (due 1Q97)
3.2.3        •      provided
3.2.4        3      system-level checkpoint/restart where supported by OS;
                    no JMS-assisted user-level checkpointing; will be
                    included in “full parallel awareness” (due 4Q96)
3.2.5        3      meets all except a couple of the resources specified in
                    3.1.5; expect complete resource accounting with “full
                    parallel awareness” (due 4Q96)
3.2.6        o      capability will be included in “dynamic parallel
                    awareness” (due 1Q97)
3.2.7        3      UNIX-level security only; DCE support (due 4Q96)
3.2.8        •      provided
3.3.1        •      administrator must write scheduler specific to site, or
                    use/modify one provided
3.3.2        3      several complex schedulers included, but not all listed
3.3.3        •      scheduler can support all listed
3.3.4        3      once running, observable resources only; other job
                    states: yes
3.3.5        •      supports space-sharing
3.3.6        •      scheduler can be written in tcl, C, or PBS scripting
                    language
3.3.7        •      provided
3.3.8        •      via PBS nodefile
3.4.1        •      “qsub -I” indicates interactive; all other options are
                    the same as for batch jobs
3.4.2        3      meets all except CPU consumption of subprocesses of
                    parallel jobs, not currently provided (due with “full
                    parallel awareness” 4Q96)
3.4.3               provided
3.4.4               jobs (except interactive jobs) are automatically
                    requeued/resumed/rerun in event of system failure
3.4.5               provided
3.4.6               provided
3.4.7               provided
3.4.8        3      can add/delete nodes from defined pool; cannot redefine
                    pool without JMS stop/restart
3.4.9               all logs are located on server host
3.4.10       o      capability will be included in “full parallel
                    awareness” (due 4Q96)
3.4.11              provided
3.4.12              meets all except “status of other computer systems”
3.4.13              provided
3.4.14              provided
4.1.1               provided
4.1.2        o      user and operator GUI due 4Q96
4.1.3        o      no graphical system configuration tool
4.1.4               via PBS nodefile
4.1.5        o      no graphical monitoring tool
4.1.6        3      supports hard limits only
4.1.7        3      public domain
4.1.8               create shared account(s) for PBS jobs to run under, and
                    restrict access via ACLs
4.1.9        3      JMS accounting provides much of the necessary data, but
                    no tools to process the data
4.1.10              provided
4.1.11       o      cannot detach/reattach
4.1.12       •      via scheduler; currently doing node reservation on SP2,
                    and disk reservation via SRFS on C90
4.1.13       •      provided
4.1.14       •      server provides ACLs for restricting/allowing access to
                    PBS; scheduler can provide ACLs for any other resources
4.1.15       •      distance not an issue as long as network is stable and
                    reliable
4.1.16       o      no workstation-owner interaction
4.1.17       o      no Windows NT support
5.1.1        o      no gang-scheduling support
5.1.2        o      first part will be “full parallel awareness” (due 4Q96)
5.1.3        o      first part will be “full parallel awareness” (due 4Q96)
5.1.4        •      where supported by OS (e.g. UNICOS)


5.0 Conclusions 

Now that the first phase of the evaluation is complete, we feel the information 
and data contained in this report will prove useful to both JMS customers and 
vendors. 

The evaluation method proved successful, as did allowing each vendor to review
the evaluation results of its product for technical accuracy. The
documentation review showed at least one vendor that its documentation needed
serious attention before the next release. This will benefit existing and
future customers alike.

In analyzing the data collected from the evaluation, we found that none of the
leading JMS packages yet meets enough of our requirements. Both the evaluation
experience and the metric described in Section 2 show that none of the JMSs
evaluated meets our minimum threshold of criteria satisfied. In fact, if we
were to drop the threshold from 90 percent to 80 percent, only one JMS would
meet it. The four JMSs ranked, from highest to lowest: PBS, LSF, LL, and NQE.
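
The pass/fail threshold can be illustrated with a minimal sketch. It assumes,
for illustration only, that the metric is the fraction of requirements scored
as fully met; the actual metric is the one defined in Section 2. The score
codes, product names, and requirement IDs below are hypothetical placeholders,
not data from this evaluation.

```python
from typing import Dict

# Hypothetical score codes: "met", "partial", "unmet".
# Assumption: only "met" counts toward the threshold.


def fraction_met(scores: Dict[str, str]) -> float:
    """Return the fraction of requirements scored as fully met."""
    if not scores:
        raise ValueError("no requirements scored")
    met = sum(1 for code in scores.values() if code == "met")
    return met / len(scores)


def passes(scores: Dict[str, str], threshold: float) -> bool:
    """Return True if the fraction of met requirements reaches the threshold."""
    return fraction_met(scores) >= threshold


# Made-up scores for two placeholder products (not results from this report).
example = {
    "JMS-A": {"R1": "met", "R2": "met", "R3": "met", "R4": "met", "R5": "partial"},
    "JMS-B": {"R1": "met", "R2": "partial", "R3": "unmet", "R4": "met", "R5": "met"},
}

for name, scores in example.items():
    print(name, fraction_met(scores), passes(scores, 0.90), passes(scores, 0.80))
```

In this made-up data, lowering the threshold from 0.90 to 0.80 lets one
placeholder product pass while the other still fails, which is the kind of
sensitivity the paragraph above describes.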

Note that this threshold metric was intended only to eliminate less capable
JMSs from the Phase 2 evaluation. We needed a metric to draw a line between
“pass” and “fail”. It should not be used as an overall comparison of the
products, because not all sites have the same needs. Sites that use this data
are encouraged to select only the criteria important to them, in order to
better understand how each product compares against their needs.
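
A site restricting the comparison to its own criteria might proceed along
these lines. This is a sketch under stated assumptions: the ranking rule
(fraction of selected requirements fully met), the score codes, and the
product and requirement names are all hypothetical and are not taken from
this report.

```python
from typing import Dict, Iterable, List, Tuple

# product name -> requirement ID -> hypothetical score code
Scores = Dict[str, Dict[str, str]]


def rank_for_site(scores: Scores, selected: Iterable[str]) -> List[Tuple[str, float]]:
    """Rank products by the fraction of site-selected requirements fully met."""
    wanted = list(selected)
    if not wanted:
        raise ValueError("select at least one requirement")
    ranking = []
    for product, reqs in scores.items():
        met = sum(1 for req in wanted if reqs.get(req) == "met")
        ranking.append((product, met / len(wanted)))
    # Highest fraction first; ties broken alphabetically for a stable order.
    return sorted(ranking, key=lambda item: (-item[1], item[0]))


# Made-up scores for two placeholder products.
scores = {
    "JMS-A": {"R1": "met", "R2": "unmet", "R3": "met", "R4": "partial"},
    "JMS-B": {"R1": "met", "R2": "met", "R3": "unmet", "R4": "met"},
}

print(rank_for_site(scores, ["R1", "R2", "R3", "R4"]))  # all criteria
print(rank_for_site(scores, ["R1", "R3"]))              # site-specific subset
```

With these made-up scores the order of the two products reverses once only
R1 and R3 are considered, which is why a single overall ranking can mislead a
site with narrower needs.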

While the bad news is the confirmation of a continuing lack of JMS support for
parallel applications, parallel systems, and clusters of workstations, the good
news is that this year will be an interesting one for JMS functionality. All
the major players will be releasing JMS versions with some amount of parallel
support by the end of 1996. We anticipate that by late fall 1996 all four
products evaluated will have responded to this evaluation with increased
support for parallel applications, even beyond what they have currently
planned.

However, due to the current lack of capability across the market, we have
decided to postpone Phase 2 of the evaluation until the products are more
mature. When we feel the market has matured sufficiently, we will perform the
Phase 1 evaluation again, and then continue through the complete evaluation as
described in Table 2 above. Assuming the product release schedules announced by
the various vendors hold firm, Table 11 shows the revised timeline.


TABLE 11. Revised Timeline of JMS Evaluation

Time Period      Activity
1 Sept - 1 Oct   Repeat Phase 1 comparison
1 Oct - 1 Nov    Summarize and publish Phase 1 results
1 Nov - 31 Dec   Phase 2 comparison
1 Jan - 15 Jan   Summarize and publish Phase 2 results
15 Jan - 31 May  Optional Phase 3 comparison; assumes two-month evaluation
                 of each product selected for Phase 3


We expect to repeat the entire evaluation process until the market produces a
product that meets the needs of sites around the world.